LAPACK Working Note UT CS Parallel numerical linear algebra
Authors
Abstract
We survey general techniques and open problems in numerical linear algebra on parallel architectures. We first discuss basic principles of parallel processing, describing the costs of basic operations on parallel machines, including general principles for constructing efficient algorithms. We illustrate these principles using current architectures and software systems, and by showing how one would implement matrix multiplication. Then we present direct and iterative algorithms for solving linear systems of equations, linear least squares problems, the symmetric eigenvalue problem, the nonsymmetric eigenvalue problem, and the singular value decomposition. We consider dense, band, and sparse matrices.
Similar resources
Enhancing Parallelism of Tile Bidiagonal Transformation on Multicore Architectures Using Tree Reduction
The objective of this paper is to enhance the parallelism of the tile bidiagonal transformation using tree reduction on multicore architectures. First introduced by Ltaief et al. [LAPACK Working Note #247, 2011], the bidiagonal transformation using tile algorithms with a two-stage approach has shown very promising results on square matrices. However, for tall and skinny matrices, the ...
LAPACK Working Note 41: Installation Guide for LAPACK
This working note describes how to install, test, and time version 3.0 of LAPACK, a linear algebra package for high-performance computers. Separate instructions are provided for the Unix and non-Unix versions of the test package. Further details are also given on the design of the test and timing programs. This work was supported by NSF Grant No. ASC-8715728.
Comparative Study of One-Sided Factorizations with Multiple Software Packages on Multi-Core Hardware (LAPACK Working Note #217)
The emergence and continuing use of multi-core architectures require changes in the existing software and sometimes even a redesign of the established algorithms in order to take advantage of now prevailing parallelism. The Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) is a project that aims to achieve both high performance and portability across a wide range of multi-c...
Accurate SVDs of Structured Matrices
We present new O(n^3) algorithms to compute very accurate SVDs of Cauchy matrices, Vandermonde matrices, and related "unit-displacement-rank" matrices. These algorithms compute all the singular values with guaranteed relative accuracy, independent of their dynamic range. In contrast, previous O(n^3) algorithms can potentially lose all relative accuracy in the tiniest singular values. LAPACK Working...
Parallel Block Hessenberg Reduction using Algorithms-By-Tiles for Multicore Architectures Revisited (LAPACK Working Note #208)
The objective of this paper is to extend and redesign the block matrix reduction applied for the family of two-sided factorizations, introduced by Dongarra et al. [9], to the context of multicore architectures using algorithms-by-tiles. In particular, the Block Hessenberg Reduction is very often used as a pre-processing step in solving dense linear algebra problems, such as the standard eigenva...